11 research outputs found

NOSQL design for analytical workloads: Variability matters

Author: Abelló Gamazo Alberto
Herrero Otal Víctor
Romero Moral Óscar
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Big Data has recently gained popularity and has strongly questioned relational databases as universal storage systems, especially in the presence of analytical workloads. As a result, co-relational alternatives, commonly known as NOSQL (Not Only SQL) databases, are extensively used for Big Data. As the primary focus of NOSQL is on performance, NOSQL databases are designed directly at the physical level, and consequently the resulting schema is tailored to the dataset and access patterns of the problem at hand. However, we believe that NOSQL design can also benefit from traditional design approaches. In this paper we present a method to design databases for analytical workloads. Starting from the conceptual model and adopting the classical 3-phase design used for relational databases, we propose a novel design method considering the new features brought by NOSQL and encompassing relational and co-relational design altogether. Peer Reviewed. Postprint (author's final draft).

UPCommons. Portal del coneixement obert de la UPC

Construcció de cubs de dades usant MapReduce sobre clústers [Building data cubes using MapReduce on clusters]

Author: Herrero Otal Víctor
Publication venue: Universitat Politècnica de Catalunya
Publication date: 22/01/2014

In this project, three different access algorithms are tested against a key-value database such as HBase, in order to identify the factors that can lead to better or worse performance for each algorithm and thereby build an optimization framework based on the costs of each algorithm.

UPCommons. Portal del coneixement obert de la UPC

Modelado de un centro de procesamiento de datos mediante ANSYS Icepak [Modeling a data processing center using ANSYS Icepak]

Author: Marín Herrero José María
Peleato Otal Víctor Manuel
Publication venue: Universidad de Zaragoza
Publication date: 01/01/2015

The overall objective of the project is to model a data processing center, specifically the one in the BIFI Building at EINA, using the computational fluid dynamics calculation and simulation program ANSYS Icepak.

Repositorio Universidad de Zaragoza

A software reference architecture for semantic-aware big data systems

Author: Abelló Gamazo Alberto
Franch Gutiérrez Javier
Herrero Otal Víctor
Nadal Francesch Sergi
Romero Moral Óscar
Valerio Danilo
Vansummeren Stijn
Publication venue: Elsevier BV
Publication date: 01/01/2016

Context: Big Data systems are a class of software systems that ingest, store, process and serve massive amounts of heterogeneous data from multiple sources. Despite their undisputed impact on current society, their engineering is still in its infancy, and companies find it difficult to adopt them due to their inherent complexity. Existing attempts to provide architectural guidelines for their engineering fail to take into account important Big Data characteristics, such as the management, evolution and quality of the data. Objective: In this paper, we follow software engineering principles to refine the λ-architecture, a reference model for Big Data systems, and use it as a seed to create Bolster, a software reference architecture (SRA) for semantic-aware Big Data systems. Method: By adding a new layer to the λ-architecture, the Semantic Layer, Bolster is capable of handling the most representative Big Data characteristics (i.e., Volume, Velocity, Variety, Variability and Veracity). Results: We present the successful implementation of Bolster in three industrial projects, involving five organizations. The validation results show a high level of agreement among practitioners from all organizations with respect to standard quality factors. Conclusion: As an SRA, Bolster allows organizations to design concrete architectures tailored to their specific needs. A distinguishing feature is that it provides semantic-awareness in Big Data systems. These are Big Data system implementations that have components to simplify data definition and exploitation. In particular, they leverage metadata (i.e., data describing data) to enable (partial) automation of data exploitation and to aid users in their decision-making processes. This simplification supports the differentiation of responsibilities into cohesive roles, enhancing data governance. Peer Reviewed. Postprint (author's final draft).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Evaluation of data placement optimizations in in-memory databases

Author: Herrero Otal Víctor
Publication venue: Universitat Politècnica de Catalunya
Publication date: 20/10/2016

UPCommons. Portal del coneixement obert de la UPC

Construcció de cubs de dades usant MapReduce sobre clústers [Building data cubes using MapReduce on clusters]

Author: Herrero Otal Víctor
Publication venue: Universitat Politècnica de Catalunya

RECERCAT

Evaluation of data placement optimizations in in-memory databases

Author: Herrero Otal Víctor
Publication venue: Universitat Politècnica de Catalunya

RECERCAT

NOSQL design for analytical workloads: Variability matters

Author: Abelló Gamazo Alberto
Herrero Otal Víctor
Romero Moral Óscar
Publication venue: Springer

RECERCAT

Tuning small analytics on Big Data: Data partitioning and secondary indexes in the Hadoop ecosystem

Author: Abelló Gamazo Alberto
Ferrarons Jaume
Herrero Otal Víctor
Romero Moral Óscar
Publication venue: Elsevier BV
Publication date: 01/01/2015

In recent years, the problems of using generic storage (i.e., relational) techniques for very specific applications have been detected and outlined and, as a consequence, some alternatives to relational DBMSs (e.g., HBase) have bloomed. Most of these alternatives sit in the cloud and benefit from cloud computing, which is nowadays a reality that helps save money by eliminating fixed hardware and software costs and paying only per use. On top of this, specific querying frameworks to exploit the brute force of the cloud (e.g., MapReduce) have also been devised. The question that then arises is whether this (rather naive) exploitation of the cloud is an alternative to tuning DBMSs, or whether it still makes sense to consider other options when retrieving data in these settings. In this paper, we study the feasibility of solving OLAP queries with Hadoop (the Apache project implementing MapReduce) while benefiting from secondary indexes and partitioning in HBase. Our main contribution is the comparison of different access plans and the definition of criteria (i.e., cost estimation) to choose among them in terms of consumed resources (namely CPU, bandwidth and I/O). Peer Reviewed

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tuning small analytics on Big Data: Data partitioning and secondary indexes in the Hadoop ecosystem

Author: Abelló Gamazo Alberto
Ferrarons Jaume
Herrero Otal Víctor
Romero Moral Óscar

RECERCAT